x

k-dimensional Equidistribution

Source: https://www.pcg-random.org/party-tricks.html

k-dimensional Equidistribution

A common use case for a random number generator is to provide points in a k-dimensional space, for some k. Any good RNG ought to be well distributed across k dimensions (i.e., there should be no pattern to the k-tuples we see), but some users would prefer a much stronger property: that over the full period of the generator, every possible k-tuple will occur, and they will all occur the same number of times. This property is known as k-dimensional equidistribution. Let's look at why this might be a desirable property.

There is a saying that if you had an infinite number of monkeys typing on an infinite number of typewriters, they would produce all the great works of literature (and an inconceivable amount of dreck, too). We can cast that scenario into the world of random number generation. Suppose we have a generator that outputs 32-bit values (i.e., four bytes), and we grab its output in chunks of 16384 values at once. Each chunk will thus be 64 KB in size. If we demand that the generator be 16384-dimensionally equidistributed, we can know that all possible 64 KB sequences of data must show up eventually over the full period of the generator, which must be at least 216384ร—32 = 2524288. Within that immensely huge collection of outputs lies every valid 64 KB zip file, some of which will contain great literature such as Hamlet. Thus, to make the saying more accurate, you don't need an infinite number of monkeys (k-tuples) to produce the works of Shakespeareโ€”2524288 is ample.

An argument for k-dimensional equidistribution goes like this: Suppose you bought a lottery ticket every weekโ€”how would you feel if you discovered that the clerk at the store was handing you a fake ticket and pocketing the money because, at 259 million to one, you were never going to win anyway. You might, rightly, feel cheated. Thus as unlikely as any particular k-tuple might be, we ought to have a real chance, however remote, of producing it.

An argument against providing k-dimensional equidistribution (for large k) is that infinitesimal probabilities aren't worth worrying about. You probably aren't going to win the lottery, and your monkeys won't write Shakespeare.

But what if rather than blindly search for Shakespeare, we manufacture a generator that is just on the cusp of producing it. With the PCG family, that's possible!

Left-click: follow link, Right-click: select node, Scroll: zoom
x